Learning to learn by gradient descent by gradient descent
Authors
Marcin Andrychowicz, Misha Denil, Sergio Gómez Colmenarejo, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas
Abstract
The move from hand-designed features to learned features in machine learning has been wildly successful. In spite of this, optimization algorithms are still designed by hand. In this paper we show how the design of an optimization algorithm can be cast as a learning problem, allowing the algorithm to learn to exploit structure in the problems of interest in an automatic way. Our learned algorithms, implemented by LSTMs, outperform generic, hand-designed competitors on the tasks for which they are trained, and also generalize well to new tasks with similar structure. We demonstrate this on a number of tasks, including simple convex problems, training neural networks, and styling images with neural art.
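As a rough sketch of the idea (not the authors' implementation): the paper replaces a hand-designed update rule with θ_{t+1} = θ_t + g_t(∇f(θ_t), φ), where the update function g is a small LSTM applied coordinate-wise, and its parameters φ are themselves trained by gradient descent on the optimizee's loss summed over an unrolled trajectory. A minimal PyTorch version of this scheme might look like:

```python
# A minimal sketch (not the authors' code) of a learned, coordinate-wise
# LSTM optimizer: the optimizee update is theta_{t+1} = theta_t + g(grad; phi),
# where g is a small LSTM shared across all coordinates.
import torch
import torch.nn as nn

class LSTMOptimizer(nn.Module):
    def __init__(self, hidden_size=20):
        super().__init__()
        self.lstm = nn.LSTMCell(1, hidden_size)  # input: one coordinate's gradient
        self.out = nn.Linear(hidden_size, 1)     # output: that coordinate's update

    def forward(self, grad, state):
        # grad: (num_coords, 1); each coordinate is a separate batch element
        h, c = self.lstm(grad, state)
        return self.out(h), (h, c)

# Meta-training unrolled on a toy quadratic f(theta) = ||A theta - b||^2.
opt_net = LSTMOptimizer()
meta_opt = torch.optim.Adam(opt_net.parameters(), lr=1e-3)

A, b = torch.randn(10, 10), torch.randn(10, 1)
theta = torch.randn(10, 1, requires_grad=True)
state = (torch.zeros(10, 20), torch.zeros(10, 20))

meta_loss = 0.0
for t in range(20):                              # unroll length
    loss = ((A @ theta - b) ** 2).sum()
    grad, = torch.autograd.grad(loss, theta, create_graph=True)
    update, state = opt_net(grad, state)
    theta = theta + update                       # theta stays on the graph for BPTT
    meta_loss = meta_loss + loss

meta_opt.zero_grad()
meta_loss.backward()                             # "gradient descent by gradient descent"
meta_opt.step()
```

The paper additionally preprocesses the gradients fed to the LSTM (e.g., log-scaling) and uses truncated backpropagation through time; those details are omitted from this sketch.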
Similar references
Handwritten Character Recognition using Modified Gradient Descent Technique of Neural Networks and Representation of Conjugate Descent for Training Patterns
The purpose of this study is to analyze the performance of the backpropagation algorithm with changing training patterns and a second momentum term in feedforward neural networks. The analysis is conducted on 250 different three-letter words formed from lowercase English letters. These words are presented to two vertical segmentation programs, designed in MATLAB, based on portions (1...
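The abstract is truncated, but for reference the standard backpropagation-with-momentum update is shown below; reading "second momentum term" as a second lagged weight change is an assumption here, since the exact variant studied is not visible:

```latex
% Backpropagation weight update with a classical momentum term \alpha and a
% hypothetical second momentum term \beta (the precise variant studied is not
% visible in the truncated abstract):
\Delta w(t) = -\eta \, \frac{\partial E}{\partial w}
              + \alpha \, \Delta w(t-1)
              + \beta  \, \Delta w(t-2)
```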
A Note on the Descent Property Theorem for the Hybrid Conjugate Gradient Algorithm CCOMB Proposed by Andrei
In [1] (Hybrid Conjugate Gradient Algorithm for Unconstrained Optimization, J. Optim. Theory Appl. 141 (2009) 249-264), an efficient hybrid conjugate gradient algorithm, the CCOMB algorithm, is proposed for solving unconstrained optimization problems. However, the proof of Theorem 2.1 in [1] is incorrect due to an erroneous inequality that was used to establish the descent property for the s...
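For context, the generic conjugate gradient direction and the descent property at issue are the standard ones shown below; the CCOMB-specific choice of β_k is not reproduced in the snippet:

```latex
% Conjugate gradient search direction and the descent property it must satisfy
% (g_k denotes the gradient at iterate x_k):
d_0 = -g_0, \qquad d_k = -g_k + \beta_k \, d_{k-1}, \qquad
g_k^{\top} d_k < 0 \quad \text{for all } k
```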
An eigenvalue study on the sufficient descent property of a modified Polak-Ribière-Polyak conjugate gradient method
Based on an eigenvalue analysis, a new proof for the sufficient descent property of the modified Polak-Ribière-Polyak conjugate gradient method proposed by Yu et al. is presented.
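For reference, the classical Polak-Ribière-Polyak parameter and the sufficient descent condition whose proof the study addresses are shown below; the specific modification by Yu et al. is not reproduced here:

```latex
% Classical PRP parameter and the sufficient descent condition with constant c > 0;
% the modified method of Yu et al. alters \beta_k, which the snippet does not show.
\beta_k^{\mathrm{PRP}} = \frac{g_k^{\top}(g_k - g_{k-1})}{\|g_{k-1}\|^2}, \qquad
g_k^{\top} d_k \le -c \, \|g_k\|^2
```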
Extensions of the Hestenes-Stiefel and Polak-Ribiere-Polyak conjugate gradient methods with sufficient descent property
Using search directions of a recent class of three-term conjugate gradient methods, modified versions of the Hestenes-Stiefel and Polak-Ribiere-Polyak methods are proposed which satisfy the sufficient descent condition. The methods are shown to be globally convergent when the line search fulfills the (strong) Wolfe conditions. Numerical experiments are done on a set of CUTEr unconstrained opti...
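The (strong) Wolfe conditions invoked for global convergence are the standard line-search requirements on the step length α_k:

```latex
% Wolfe conditions with 0 < \delta < \sigma < 1; the strong version uses the
% absolute-value form of the second (curvature) condition, as given here:
f(x_k + \alpha_k d_k) \le f(x_k) + \delta \, \alpha_k \, g_k^{\top} d_k, \qquad
\big| g(x_k + \alpha_k d_k)^{\top} d_k \big| \le \sigma \, \big| g_k^{\top} d_k \big|
```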
Forecasting GDP Growth Using ANN Model with Genetic Algorithm
Applying nonlinear models to the estimation and forecasting of economic models is now becoming more common, thanks to advances in computing technology. Artificial Neural Network (ANN) models, which are nonlinear local-optimization models, have proven successful in forecasting economic variables. Most ANN models applied in economics use the gradient descent method as their learning algorithm. However, t...
Learning to Learn without Gradient Descent by Gradient Descent
We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-paramete...